AITopics | Bakersfield

The Perverse, Tender Worlds of Paul Thomas Anderson

The New Yorker | Mar-9-2026, 10:00:00 GMT

The filmmaker behind "One Battle After Another" specializes in stories about people who are cut off, adrift, desperately seeking connection. His films are studies of American loneliness. The director plunges us into the physical realization of experience with a thoroughness that can be unsettling. What is the sound of a needle entering fabric? Something more significant, it seems, than the sound of one hand clapping. You hear a tiny pop followed by the rustle of violated muslin--a shudder in the silence of the universe. Scrupulous directors make sure that the sound of their movies is grossly efficient, so that the dramatic meaning of a scene is apparent even in the worst theatre or home system in the country. They also layer in, for those who care about such things, a secondary level of sound--think of the swishing skirts in Martin Scorsese's adaptation of Edith Wharton's "The Age of Innocence." In "Phantom Thread" (2017)--the needle-and-fabric movie--the director, Paul Thomas Anderson, uses such details to build an exquisitely perceptible epic of minute events.

anderson, artificial intelligence, movie, (8 more...)

The New Yorker

Country:

North America > United States > New York (0.05)
North America > Mexico (0.04)
North America > United States > Texas > Presidio County > Marfa (0.04)
(3 more...)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology: Information Technology > Artificial Intelligence (0.68)

DataComp-LM: In search of the next generation of training sets for language models

Neural Information Processing Systems | Feb-8-2026, 17:45:09 GMT

As a baseline for DCLM, we conduct extensive experiments and find that model-based filtering is key to assembling a high-quality training set.

large language model, machine learning, urlhttp, (22 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > Texas > Travis County > Austin (0.14)
(27 more...)

Genre: Research Report (1.00)

Industry:

Government (1.00)
Education (1.00)
Law (0.92)
(3 more...)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(5 more...)

Rare, deep-sea encounter: California scientists observe 'extraordinary' seven-arm octopus

Los Angeles Times | Dec-13-2025, 00:05:57 GMT

On November 6, 2025, MBARI Senior Scientist Steven Haddock and researchers in MBARI's Biodiversity and Biooptics Team observed a seven-arm octopus (Haliphron atlanticus) during an expedition in Monterey Bay with MBARI's remotely operated vehicle at a depth of approximately 700 meters. California scientists captured rare footage of a seven-arm octopus eating a jellyfish.

artificial intelligence, california scientist, social media, (16 more...)

Los Angeles Times

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.07)
Pacific Ocean (0.05)
North America > United States > Washington (0.05)
(8 more...)

Industry:

Health & Medicine (1.00)
Media > News (0.71)
Law > Criminal Law (0.48)
(2 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (0.91)

The Most Dangerous Genre

The New Yorker | Nov-18-2025, 11:00:00 GMT

Our obsession with deadly game shows--from "The Running Man" and "Squid Game" to MrBeast's real-life reënactments--reflects a shift in the national mood to something increasingly zero-sum. It seems we can't get enough of game shows in which the losers die. "The Hunger Games" became a multibillion-dollar media franchise over the past decade, with audiences returning to the theatre, time and time again, to watch adolescents try to kill one another in an enormous arena--a contest devised by the leaders of a society rife with inequality. Netflix's "Squid Game" followed four hundred and fifty-six desperate individuals into an underworld where they play lethal versions of children's games in the hope of winning a life-changing amount of money. Four weeks after its release, the show had become Netflix's most-watched series ever; to date, the first season has been viewed more than two hundred and sixty-five million times.

artificial intelligence, dangerous genre, game show, (16 more...)

The New Yorker

Country:

North America > United States > New York (0.05)
North America > United States > California > Kern County > Bakersfield (0.05)
Europe > Ukraine > Kyiv Oblast > Chernobyl (0.04)
Asia > Middle East > Syria (0.04)

Industry:

Media > Television (1.00)
Media > Film (1.00)
Leisure & Entertainment (1.00)

Technology: Information Technology > Artificial Intelligence (0.47)

19e4ea30dded58259665db375885e412-Paper-Datasets_and_Benchmarks_Track.pdf

Neural Information Processing Systems | Oct-9-2025, 19:55:07 GMT

chain-of-thought prompting, commonsense knowledge, obscure typographical measurement system, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.27)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
(31 more...)

Genre:

Research Report > New Finding (0.92)
Research Report > Experimental Study (0.67)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
(2 more...)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Quality (1.00)
(11 more...)

The study of short texts in digital politics: Document aggregation for topic modeling

Nakka, Nitheesha, Yalcin, Omer F., Desmarais, Bruce A., Rajtmajer, Sarah, Monroe, Burt

arXiv.org Artificial Intelligence | Mar-6-2025

Statistical topic modeling is widely used in political science to study text. Researchers examine documents of varying lengths, from tweets to speeches. There is ongoing debate on how document length affects the interpretability of topic models. We investigate the effects of aggregating short documents into larger ones based on natural units that partition the corpus. In our study, we analyze one million tweets by U.S. state legislators from April 2016 to September 2020. We find that for documents aggregated at the account level, topics are more associated with individual states than when using individual tweets. This finding is replicated with Wikipedia pages aggregated by birth cities, showing how document definitions can impact topic modeling results.
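The aggregation step described in this abstract can be illustrated with a minimal sketch. It assumes each short document arrives as a (unit, text) pair, where the unit is the natural grouping key such as an account. The function name `aggregate_by_unit` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def aggregate_by_unit(docs):
    """Concatenate short documents that share the same unit key (e.g. an account)."""
    grouped = defaultdict(list)
    for unit, text in docs:
        grouped[unit].append(text)
    # One aggregated document per unit, with texts joined in their original order.
    return {unit: " ".join(texts) for unit, texts in grouped.items()}


# Hypothetical toy data: (account, tweet) pairs.
tweets = [
    ("legislator_a", "Budget vote today"),
    ("legislator_b", "Highway funding update"),
    ("legislator_a", "Town hall on schools"),
]

aggregated = aggregate_by_unit(tweets)
for account, document in aggregated.items():
    print(account, "->", document)
```

The aggregated documents, rather than the individual tweets, would then be passed to whichever topic model is in use.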

football, legislator, university, (15 more...)

arXiv.org Artificial Intelligence

2503.05065

Country:

North America > United States > Maryland > Baltimore (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
(106 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Media > News (1.00)
Media > Music (1.00)
Media > Film (1.00)
(16 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.93)

UniMEL: A Unified Framework for Multimodal Entity Linking with Large Language Models

Qi, Liu, Yongyi, He, Defu, Lian, Zhi, Zheng, Tong, Xu, Che, Liu, Enhong, Chen

arXiv.org Artificial Intelligence | Aug-20-2024

Multimodal Entity Linking (MEL) is a crucial task that aims at linking ambiguous mentions within multimodal contexts to the referent entities in a multimodal knowledge base, such as Wikipedia. Existing methods focus heavily on using complex mechanisms and extensive model tuning methods to model the multimodal interaction on specific datasets. However, these methods overcomplicate the MEL task and overlook the visual semantic information, which makes them costly and hard to scale. Moreover, these methods cannot resolve issues such as textual ambiguity, redundancy, and noisy images, which severely degrade their performance. Fortunately, the advent of Large Language Models (LLMs) with robust capabilities in text understanding and reasoning, particularly Multimodal Large Language Models (MLLMs) that can process multimodal inputs, provides new insights into addressing this challenge. However, how to design a universally applicable LLM-based MEL approach remains a pressing challenge. To this end, we propose UniMEL, a unified framework that establishes a new paradigm for processing multimodal entity linking tasks using LLMs. In this framework, we employ LLMs to augment the representation of mentions and entities individually by integrating textual and visual information and refining textual information. Subsequently, we employ an embedding-based method for retrieving and re-ranking candidate entities. Then, with only ~0.26% of the model parameters fine-tuned, LLMs can make the final selection from the candidate entities. Extensive experiments on three public benchmark datasets demonstrate that our solution achieves state-of-the-art performance, and ablation studies verify the effectiveness of all modules. Our code is available at https://github.com/Javkonline/UniMEL.
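As a rough illustration of embedding-based candidate retrieval, the sketch below ranks entities by cosine similarity to a mention vector and keeps the top k. It is a generic technique shown under stated assumptions, not the UniMEL implementation: the function names, the two-dimensional vectors, and the entity names are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve_top_k(mention_vec, entity_vecs, k):
    """Return the k entity names whose embeddings are most similar to the mention."""
    scored = [(cosine(mention_vec, vec), name) for name, vec in entity_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical toy embeddings; real systems would use a learned encoder.
entities = {
    "Apple Inc.": [1.0, 0.0],
    "Apple (fruit)": [0.0, 1.0],
    "Apple Records": [0.7, 0.7],
}
mention = [0.9, 0.1]

candidates = retrieve_top_k(mention, entities, k=2)
print(candidates)
```

In a pipeline like the one the abstract outlines, a short candidate list of this kind would be handed to a later stage for the final selection.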

dataset, information, multimodal entity, (14 more...)

arXiv.org Artificial Intelligence

2407.1616

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Idaho > Ada County > Boise (0.05)
Asia > China > Anhui Province > Hefei (0.05)
(21 more...)

Genre: Research Report (0.82)

Industry:

Government (0.68)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

DataComp-LM: In search of the next generation of training sets for language models

Li, Jeffrey, Fang, Alex, Smyrnis, Georgios, Ivgi, Maor, Jordan, Matt, Gadre, Samir, Bansal, Hritik, Guha, Etash, Keh, Sedrick, Arora, Kushal, Garg, Saurabh, Xin, Rui, Muennighoff, Niklas, Heckel, Reinhard, Mercat, Jean, Chen, Mayee, Gururangan, Suchin, Wortsman, Mitchell, Albalak, Alon, Bitton, Yonatan, Nezhurina, Marianna, Abbas, Amro, Hsieh, Cheng-Yu, Ghosh, Dhruba, Gardner, Josh, Kilian, Maciej, Zhang, Hanlin, Shao, Rulin, Pratt, Sarah, Sanyal, Sunny, Ilharco, Gabriel, Daras, Giannis, Marathe, Kalyani, Gokaslan, Aaron, Zhang, Jieyu, Chandu, Khyathi, Nguyen, Thao, Vasiljevic, Igor, Kakade, Sham, Song, Shuran, Sanghavi, Sujay, Faghri, Fartash, Oh, Sewoong, Zettlemoyer, Luke, Lo, Kyle, El-Nouby, Alaaeldin, Pouransari, Hadi, Toshev, Alexander, Wang, Stephanie, Groeneveld, Dirk, Soldaini, Luca, Koh, Pang Wei, Jitsev, Jenia, Kollar, Thomas, Dimakis, Alexandros G., Carmon, Yair, Dave, Achal, Schmidt, Ludwig, Shankar, Vaishaal

arXiv.org Artificial Intelligence | Jun-20-2024

We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models. As part of DCLM, we provide a standardized corpus of 240T tokens extracted from Common Crawl, effective pretraining recipes based on the OpenLM framework, and a broad suite of 53 downstream evaluations. Participants in the DCLM benchmark can experiment with data curation strategies such as deduplication, filtering, and data mixing at model scales ranging from 412M to 7B parameters. As a baseline for DCLM, we conduct extensive experiments and find that model-based filtering is key to assembling a high-quality training set. The resulting dataset, DCLM-Baseline, enables training a 7B parameter language model from scratch to 64% 5-shot accuracy on MMLU with 2.6T training tokens. Compared to MAP-Neo, the previous state-of-the-art in open-data language models, DCLM-Baseline represents a 6.6 percentage point improvement on MMLU while being trained with 40% less compute. Our baseline model is also comparable to Mistral-7B-v0.3 and Llama 3 8B on MMLU (63% & 66%), and performs similarly on an average of 53 natural language understanding tasks while being trained with 6.6x less compute than Llama 3 8B. Our results highlight the importance of dataset design for training language models and offer a starting point for further research on data curation.
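Two of the curation strategies named in this abstract, deduplication and filtering by a quality score, can be sketched in a few lines. This is a toy illustration only: exact-match hashing stands in for whatever deduplication the benchmark uses, and `toy_quality_score` is a hypothetical placeholder for a learned quality classifier, not the DCLM filter.

```python
import hashlib


def deduplicate(docs):
    """Drop documents that match an earlier one after case and whitespace normalization."""
    seen = set()
    unique = []
    for doc in docs:
        normalized = " ".join(doc.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


def filter_by_score(docs, score_fn, threshold):
    """Keep only documents whose quality score reaches the threshold."""
    return [doc for doc in docs if score_fn(doc) >= threshold]


def toy_quality_score(doc):
    """Hypothetical stand-in for a model: fraction of purely alphabetic tokens."""
    tokens = doc.split()
    if not tokens:
        return 0.0
    return sum(token.isalpha() for token in tokens) / len(tokens)


corpus = [
    "The cat sat on the mat",
    "the  cat sat on the mat",
    "$$$ 123 !!! buy",
    "Language models need clean data",
]

unique_docs = deduplicate(corpus)
kept = filter_by_score(unique_docs, toy_quality_score, threshold=0.5)
print(kept)
```

Swapping `toy_quality_score` for a trained classifier is the step the abstract refers to as model-based filtering.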

chain-of-thought prompting, training hyperparameter, training language model, (15 more...)

arXiv.org Artificial Intelligence

2406.11794

Country:

North America > United States > Texas > Travis County > Austin (0.27)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
(30 more...)

Genre:

Research Report > New Finding (0.87)
Research Report > Experimental Study (0.65)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Cellular Traffic Prediction Using Online Prediction Algorithms

Mehri, Hossein, Chen, Hao, Mehrpouyan, Hani

arXiv.org Artificial Intelligence | May-8-2024

The advent of 5G technology promises a paradigm shift in the realm of telecommunications, offering unprecedented speeds and connectivity. However, the efficient management of traffic in 5G networks remains a critical challenge. This is due to the dynamic and heterogeneous nature of network traffic, varying user behaviors, extended network size, and diverse applications, all of which demand highly accurate and adaptable prediction models to optimize network resource allocation and management. This paper investigates the efficacy of live prediction algorithms for forecasting cellular network traffic in real-time scenarios. We apply two live prediction algorithms to machine learning models, one of which is the recently proposed Fast LiveStream Prediction (FLSP) algorithm. We examine the performance of these algorithms under two distinct data gathering methodologies: synchronous, where all network cells report statistics simultaneously, and asynchronous, where reporting occurs across consecutive time slots. Our study delves into the impact of these gathering scenarios on the predictive performance of traffic models. Our study reveals that the FLSP algorithm can halve the required bandwidth for asynchronous data reporting compared to conventional online prediction algorithms, while simultaneously enhancing prediction accuracy and reducing processing load. Additionally, we conduct a thorough analysis of algorithmic complexity and memory requirements across various machine learning models. Through empirical evaluation, we provide insights into the trade-offs inherent in different prediction strategies, offering valuable guidance for network optimization and resource allocation in dynamic environments.

algorithm, complexity, scenario, (15 more...)

arXiv.org Artificial Intelligence

2405.05239

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.07)
North America > United States > Idaho > Ada County > Boise (0.05)
North America > United States > New York > Onondaga County > Syracuse (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry: Telecommunications (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

RACH Traffic Prediction in Massive Machine Type Communications

Mehri, Hossein, Chen, Hao, Mehrpouyan, Hani

arXiv.org Artificial Intelligence | May-8-2024

Traffic pattern prediction has emerged as a promising approach for efficiently managing and mitigating the impacts of event-driven bursty traffic in massive machine-type communication (mMTC) networks. However, achieving accurate predictions of bursty traffic remains a non-trivial task due to the inherent randomness of events, and these challenges intensify within live network environments. Consequently, there is a compelling need to design a lightweight and agile framework capable of assimilating continuously collected data from the network and accurately forecasting bursty traffic in mMTC networks. This paper addresses these challenges by presenting a machine learning-based framework tailored for forecasting bursty traffic in multi-channel slotted ALOHA networks. The proposed machine learning network comprises long short-term memory (LSTM) and a DenseNet with feed-forward neural network (FFNN) layers, where the residual connections enhance the training ability of the machine learning network in capturing complicated patterns. Furthermore, we develop a new low-complexity online prediction algorithm that updates the states of the LSTM network by leveraging frequently collected data from the mMTC network. Simulation results and complexity analysis demonstrate the superiority of our proposed algorithm in terms of both accuracy and complexity, making it well-suited for time-critical live scenarios. We evaluate the performance of the proposed framework in a network with a single base station and thousands of devices organized into groups with distinct traffic-generating characteristics. Comprehensive evaluations and simulations indicate that our proposed machine learning approach achieves a remarkable 52% higher accuracy in long-term predictions compared to traditional methods, without imposing additional processing load on the system.
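For readers unfamiliar with multi-channel slotted ALOHA, the sketch below shows the textbook collision model for a single slot: each active device picks one channel at random, and a transmission succeeds only if no other device chose the same channel. This is an assumption-laden illustration of the access scheme, not the paper's simulator; the function names and parameter values are hypothetical.

```python
import random
from collections import Counter


def successful_transmissions(channel_choices):
    """Count channels chosen by exactly one device (no collision) in a slot."""
    counts = Counter(channel_choices)
    return sum(1 for count in counts.values() if count == 1)


def simulate_slot(num_devices, num_channels, rng):
    """Each device picks a channel uniformly at random; return the successes."""
    choices = [rng.randrange(num_channels) for _ in range(num_devices)]
    return successful_transmissions(choices)


# Deterministic example: channels 1 and 3 each carry a single device.
print(successful_transmissions([0, 0, 1, 2, 2, 3]))

# Randomized example with a fixed seed so that runs are repeatable.
rng = random.Random(0)
successes = simulate_slot(num_devices=30, num_channels=10, rng=rng)
print(successes)
```

When a burst activates many devices at once, most channels collide and few transmissions succeed, which is why forecasting such bursts is useful for managing the network.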

algorithm, prediction, traffic, (17 more...)

arXiv.org Artificial Intelligence

2405.05235

Country:

North America > United States > Idaho > Ada County > Boise (0.05)
North America > United States > New York > Onondaga County > Syracuse (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
(6 more...)

Genre: Research Report > Promising Solution (0.34)

Industry:

Telecommunications (1.00)
Government > Regional Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
